K-Medoids For K-Means Seeding

نویسندگان

  • James Newling
  • François Fleuret
چکیده

We run experiments showing that algorithm clarans (Ng et al., 2005) finds better Kmedoids solutions than the standard algorithm. This finding, along with the similarity between the standard K-medoids and K-means algorithms, suggests that clarans may be an effective K-means initializer. We show that this is the case, with clarans outperforming other popular seeding algorithms on 23/23 datasets with a mean decrease over k-means++ (Arthur and Vassilvitskii, 2007) of 30% for initialization mse and 3% for final mse. We describe how the complexity and runtime of clarans can be improved, making it a viable initialization scheme for large datasets.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A simple and fast algorithm for k medoids clustering pdf

This paper proposes a new algorithm for K-medoids clustering which runs like the K-means algorithm and tests several methods for selecting.This paper proposes a new algorithm for K-medoids clustering which runs like the. A new Kmedoids clustering method that should be fast and efficient.

متن کامل

A K-means-like Algorithm for K-medoids Clustering

Clustering analysis is a descriptive task that seeks to identify homogeneous groups of objects based on the values of their attributes. This paper proposes a new algorithm for K-medoids clustering which runs like the K-means algorithm and tests several methods for selecting initial medoids. The proposed algorithm calculates the distance matrix once and uses it for finding new medoids at every i...

متن کامل

Analysis and Implementation of Modified K-medoids Algorithm to Increase Scalability and Efficiency for Large Dataset

Clustering plays a vital role in research area in the field of data mining. Clustering is a process of partitioning a set of data in a meaningful sub classes called clusters. It helps users to understand the natural grouping of cluster from the data set. It is unsupervised classification that means it has no predefined classes. Applications of cluster analysis are Economic Science, Document cla...

متن کامل

A simple and fast algorithm for K-medoids clustering

This paper proposes a new algorithm for K-medoids clustering which runs like the K-means algorithm and tests several methods for selecting initial medoids. The proposed algorithm calculates the distance matrix once and uses it for finding new medoids at every iterative step. To evaluate the proposed algorithm, we use some real and artificial data sets and compare with the results of other algor...

متن کامل

DAGger: Clustering Correlated Uncertain Data

◮ It can compute exact and approximate probabilities with error guarantees for the clustering output. State-of-the-art techniques (e.g. UK-means, UKmedoids, MMVar): ◮ do not support the possible worlds semantics, ◮ lack support for correlations and assume probabilistic independence, ◮ use deterministic cluster medoids or expected means, and ◮ can only compute clustering based on expected distan...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017